Challenges, Techniques and Directions in Building XSeek: an XML Search Engine

Authors

  • Ziyang Liu
  • Peng Sun
  • Yu Huang
  • Yichuan Cai
  • Yi Chen
Abstract

The importance of supporting keyword search on XML data is widely recognized. Unlike structured queries, keyword searches are inherently ambiguous because users are unable or unwilling to specify precise semantics. As a result, processing keyword searches involves many unique challenges. In this paper we discuss the motivation, desiderata, and challenges in supporting keyword searches on XML data. We then present XSeek, an XML keyword search engine that addresses these challenges in several respects: identifying explicit relevant nodes, identifying implicit relevant nodes, and generating result snippets. Finally, we discuss the remaining issues and future research directions.

Similar resources

XSeek: A Semantic XML Search Engine Using Keywords

We present XSeek, a keyword search engine that enables users to easily access XML data without the need of learning XPath or XQuery and studying possibly complex data schemas. XSeek addresses a challenge in XML keyword search that has been neglected in the literature: how to determine the desired return information, analogous to inferring a “return” clause in XQuery. To infer the search semanti...

MAXLCA: A New Query Semantic Model for XML Keyword Search

Keyword search enables web users to easily access XML data without understanding the complex data schemas. However, the ambiguity of keyword search makes it arduous to select qualified data nodes matching keywords. To address this challenge in XML datasets whose documents have a relatively low average size, we present a new keyword query semantic model: MAXimal Lowest Common Ancestor (MAXLCA). ...

Guess What I Want: Inferring the Semantics of Keyword Queries Using Evidence Theory

The tagged and nested structure of an XML document provides quite detailed information about its structure and semantics, which is neglected by traditional keyword search models such as TF-IDF and BM25. Popular XML search models such as SLCA and XRANK tend to return the “deepest” node containing all given keywords, which usually leads to semantic loss. In this paper, we introduce the concept of ...

The Web in Ten Years: Challenges and Opportunities for Database Research

In order to evolve into a dependable and ubiquitous information infrastructure, the World Wide Web needs comprehensive quality, performance, and availability guarantees for all kinds of E-services including search engines. To improve the search result quality of search engines and to exploit the Web’s potential as a world-wide knowledge base, intensive research efforts are required that center ...

Semantic Based XML Context Driven Search And Retrieval System

In this paper we present a context-driven search engine called XCD Search for answering XML keyword-based queries as well as loosely structured queries, using a stack-based sort-merge algorithm. Most current research focuses on building relationships between data elements based solely on their labels and proximity to one another, while overlooking the contexts of the elements, which may lea...

Journal:
  • IEEE Data Eng. Bull.

Volume 32

Publication date 2009